Unsupervised Learning of Morphology

Authors

  • Harald Hammarström
  • Lars Borin
Abstract

some morphological pattern that recurs among the groups. Such emergent patterns provide enough clues for segmentation and can sometimes be formulated as rules or morphological paradigms. (c) Features and Classes: In this family of methods, a word is seen as made up of a set of features—n-grams in Mayfield and McNamee (2003) and McNamee and Mayfield (2007), and initial/terminal/mid-substring in De Pauw and Wagacha (2007). Features which occur on many words have little selective power across the words, whereas features which occur seldom pinpoint a specific word or stem. To formalize this intuition, Mayfield and McNamee and McNamee and Mayfield use TF-IDF, and De Pauw and Wagacha use entropy. Classifying an unseen word reduces to using its features to select which word(s) it may be morphologically related to. This decides whether the unseen word is a morphological variant of some other word, and allows extracting the “variation” by which they are related, such as an affix. (d) Phonological Categories and Separation: In this family of methods, the phonemes (approximated by graphemes) are first classed into categories, foremostly, vowel versus consonant. Thereafter, each word is separated into its vowel skeleton and its consonant skeleton, after which various
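The "Features and Classes" idea in (c) can be illustrated with a minimal sketch. It assumes character trigrams as features, a plain log-ratio IDF weight, and cosine similarity for picking the related word; these choices, and all function names (`char_ngrams`, `build_idf`, `most_related`, `variation`), are illustrative assumptions and not the exact procedure of Mayfield and McNamee or of De Pauw and Wagacha.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, with ^ and $ as boundary marks."""
    padded = f"^{word}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_idf(vocab, n=3):
    """Weight each n-gram by log(N / df): frequent features get low weight."""
    df = Counter()
    for w in vocab:
        df.update(set(char_ngrams(w, n)))
    total = len(vocab)
    return {g: math.log(total / c) for g, c in df.items()}


def tfidf_vector(word, idf, n=3):
    """TF-IDF vector of a word; n-grams never seen in the vocabulary are dropped."""
    tf = Counter(char_ngrams(word, n))
    return {g: tf[g] * idf[g] for g in tf if g in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[g] * v[g] for g in u if g in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def most_related(word, vocab, idf, n=3):
    """Pick the vocabulary word whose feature vector is closest to the unseen word."""
    target = tfidf_vector(word, idf, n)
    return max(vocab, key=lambda w: cosine(target, tfidf_vector(w, idf, n)))


def variation(word, base):
    """Strip the shared prefix and return the differing tails (base_tail, word_tail)."""
    i = 0
    while i < min(len(word), len(base)) and word[i] == base[i]:
        i += 1
    return base[i:], word[i:]


vocab = ["walk", "walked", "talk", "talked", "jump", "jumped", "play"]
idf = build_idf(vocab)
related = most_related("walking", vocab, idf)
print(related, variation("walking", related))
```

On this toy vocabulary, the trigram shared by the "walk" and "talk" families receives a low weight because it occurs in four of the seven words, while the rarer word-initial trigrams decide which family the unseen word belongs to. The differing tail is then a candidate affix.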


Similar Resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...


Unsupervised Learning by Program Synthesis

We introduce an unsupervised learning algorithm that combines probabilistic modeling with solver-based techniques for program synthesis. We apply our techniques to both a visual learning domain and a language learning problem, showing that our algorithm can learn many visual concepts from only a few examples and that it can recover some English inflectional morphology. Taken together, these res...


Unsupervised Learning of Morphology by using Syntactic Categories

This paper presents a method for unsupervised learning of morphology that exploits the syntactic categories of words. Previous research [4][12] on learning of morphology and syntax has shown that both kinds of knowledge affect each other making it possible to use one type of knowledge to help the other. In this work, we make use of syntactic information i.e. Part-of-Speech (PoS) tags of words t...


An Unsupervised Learning Method for an Attacker Agent in Robot Soccer Competitions Based on the Kohonen Neural Network

The RoboCup competition, as a great test-bed, has become a popular domain worldwide in recent years. The main objective of such competitions is to deal with the complex behavior of systems which consist of multiple autonomous agents. The rich experience of human soccer players can be used as a valuable reference for a robot soccer player. However, because of the differences between real and simulated soc...



Journal:
  • Computational Linguistics

Volume: 37  Issue:

Pages: -

Publication date: 2011